Efficient Dynamic Pruning with Proximity Support
Abstract
Modern retrieval approaches do not apply only single-term weighting models when ranking documents. Proximity weighting models are also in common use; these assign high scores to documents in which pairs of query terms co-occur in close proximity to each other. Adopting proximity weighting models can add computational overhead when documents are scored, which reduces the efficiency of the retrieval process. In this paper, we discuss the integration of proximity weighting models into efficient dynamic pruning strategies. In particular, we propose to modify document-at-a-time strategies to include proximity scoring without any modifications to pre-existing index structures. Our resulting two-stage dynamic pruning strategies consider only single query terms during first-stage pruning, but can terminate the proximity scoring of a document early if it can be shown that the document will never be retrieved. We empirically examine the efficiency benefits of our approach using a large Web test collection of 50 million documents and 10,000 queries from a real query log. Our results show that our proposed two-stage dynamic pruning strategies are considerably more efficient than the original strategies, particularly for queries of 3 or more terms.
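The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`capped_pair_proximity`, `two_stage_top_k`), the capped co-occurrence count used as a proximity score, and the per-pair upper bound `pair_bound` are all hypothetical choices made for illustration. The sketch assumes each document arrives with its single-term score already computed, and that every pair score is bounded above by `pair_bound`, so a document can be abandoned as soon as its best possible final score cannot exceed the current top-k threshold.

```python
import heapq
from itertools import combinations


def capped_pair_proximity(pos_a, pos_b, window=5, cap=2):
    """Hypothetical proximity score: co-occurrences within `window`, capped at `cap`.

    The cap guarantees that `cap` is a valid upper bound on any pair score.
    """
    count = sum(1 for i in pos_a for j in pos_b if abs(i - j) < window)
    return float(min(count, cap))


def two_stage_top_k(docs, query_terms, k, pair_bound, prox_fn=capped_pair_proximity):
    """Return (top-k as [(score, doc_id)] in descending order, pair scorings done).

    docs: iterable of (doc_id, term_score, positions), where positions maps
    a term to its list of positions in the document (an assumed layout).
    """
    pairs = list(combinations(query_terms, 2))
    heap = []  # min-heap holding the current top-k (score, doc_id) entries
    pair_evals = 0
    for doc_id, term_score, positions in docs:
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        score = term_score
        pruned = False
        for idx, (a, b) in enumerate(pairs):
            # Best case: every pair not yet scored reaches its upper bound.
            best_possible = score + pair_bound * (len(pairs) - idx)
            if best_possible <= threshold:
                pruned = True  # cannot enter the top k, stop scoring pairs
                break
            score += prox_fn(positions.get(a, []), positions.get(b, []))
            pair_evals += 1
        if pruned or score <= threshold:
            continue
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        else:
            heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True), pair_evals


# Toy data (invented for illustration): three query terms, so three term pairs.
docs = [
    (1, 10.0, {"a": [1], "b": [2], "c": [3]}),
    (2, 9.0, {"a": [1], "b": [100], "c": [200]}),
    (3, 1.0, {"a": [1, 2], "b": [3, 4], "c": [5]}),
    (4, 6.0, {"a": [1], "b": [50], "c": [51]}),
]
top, evals = two_stage_top_k(docs, ["a", "b", "c"], k=2, pair_bound=2.0)
```

In this toy run, document 3 is rejected before any pair is scored, and document 4 is abandoned after two of its three pairs, which is the kind of early termination the abstract refers to. The first check in the loop happens before any proximity work and uses only the single-term score plus the pair bounds.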
Similar resources
Adaptive Predictive Controllers Using a Growing and Pruning RBF Neural Network
An adaptive version of growing and pruning RBF neural network has been used to predict the system output and implement Linear Model-Based Predictive Controller (LMPC) and Non-linear Model-based Predictive Controller (NMPC) strategies. A radial-basis neural network with growing and pruning capabilities is introduced to carry out on-line model identification. An Unscented Kal...
Trie-Join: Efficient Trie-based String Similarity Joins with Edit-Distance Constraints
A string similarity join finds similar pairs between two collections of strings. It is an essential operation in many applications, such as data integration and cleaning, and has attracted significant attention recently. In this paper, we study string similarity joins with edit-distance constraints. Existing methods usually employ a filter-and-refine framework and have the following disadvantag...
Efficient query processing and index tuning using proximity scores
In the presence of growing data, the need for efficient query processing under result quality and index size control becomes more and more a challenge to search engines. We show how to use proximity scores to make query processing effective and efficient with focus on either of the optimization goals. More precisely, we make the following contributions: • We present a comprehensive comparative ...
Efficient Text Proximity Search
In addition to purely occurrence-based relevance models, term proximity has been frequently used to enhance retrieval quality of keyword-oriented retrieval systems. While there have been approaches on effective scoring functions that incorporate proximity, there has not been much work on algorithms or access methods for their efficient evaluation. This paper presents an efficient evaluation fra...
Dynamic beam pruning strategy using adaptive control
In large vocabulary continuous speech recognition, the huge search space results in vast computational cost. Most pruning search strategies can reduce the computation, but the recognition rate often decreases. This paper aims to reduce the computation time without any sacrifice of the recognition rate. By means of the adaptive control theory, a novel pruning method is presented. It can automa...
Publication date: 2010